perm filename PAPER.PUB[2,TES] blob sn#009886 filedate 1972-07-28 generic text, type T, neo UTF8
00100	.SEC INTRODUCTION
00200	A study at the Stanford Artificial Intelligence Project (AI) has
00300	shown that it is more economical to prepare text on computer
00400	terminals than on typewriters for documents that are subject to
00500	revision at least once.  The AI Lab has an in-house PDP-10/50
00600	Time-Sharing system with about 40 terminals, nearly all of which are
00700	of the keyboard-display type.  To encourage and facilitate
00800	utilization of the computer in the publication process, the Lab
00900	provides text editing and formatting software and a variety of output
01000	media.
01100	
01200	Currently available for text editing are a teletype-oriented and a
01300	display-oriented editor.  Documents can be printed on a Model 37
01400	Teletype, a high-speed printer, or microfilm (the microfilm is
01500	prepared using FR-80 services purchased from a vendor).  For widely
01600	circulated reports, any of these media can be used to prepare offset
01700	masters.  Output via the Xerox Graphic Printer is to be implemented
01800	shortly, providing mixed user-definable type fonts and graphics.
01900	
02000	The term "text formatting" applies to the processing that follows
02100	interactive text editing and precedes document printing.  It includes
02200	justification, page numbering, section numbering, layout, footnote
02300	placement, and special capabilities such as index preparation and
02400	cross-referencing.  Although several text formatting systems are
02500	available for the PDP-10, the desire for additional capabilities led
02600	to the development of a new kind of program, known as a
02700	"document compiler".  A prototype document compiler has been in use
02800	since the Fall of 1971; its acronym is "PUB" (PUBlication system).
02900	
03000	The input to PUB is a "manuscript" file, prepared using one of the
03100	available text editors.  The manuscript contains the unformatted text
03200	of the publication, plus commands and control characters that direct
03300	PUB in the formatting process.  The output of PUB is a "document"
03400	file, i.e., a disk file which can be printed on one of the available
03500	output devices by standard utility programs.
03600	
03700	PUB is called a "document compiler" because of several analogies
03800	between it and compilers for programming languages.  Within PUB is an
03900	Algol-like language, with macros, in which the user can process
04000	integer and character-string data to achieve complex formatting
04100	operations.  Cross-referencing is achieved with the aid of "labels"
04200	very similar to the labels customary in programming languages.
04300	Automatic numbering of sections, figures, equations, footnotes,
04400	pages, and other entities is implemented using "counters" that are
04500	stepped and reset under control of a statement resembling the Algol
04600	FOR statement.
04700	
04800	Even to the time-sharing monitor, PUB appears to be a compiler.  Its
04900	"source program" is the manuscript and its "object program" is the
05000	document.  Monitor facilities for rapid cycling through the
05100	edit-compile-execute loop of program development have been made
05200	available in the edit-compile-print loop of document preparation.
     

00100	.SEC PUB LANGUAGES
00200	PUB has both a ∪text ∪language and a ∪command ∪language.  Basic
00300	components of the text language are ∪characters, ∪words, ∪sentences,
00400	and ∪paragraphs.  The command language includes ∪numbers, ∪strings,
00500	∪variables, ∪expressions, ∪declarations, ∪statements, and ∪labels.  A
00600	document is programmed by coordinated use of these two languages.
00700	
00800	The text language adheres closely to informal conventions common in
00900	the preparation of manuscripts for publication.  A word usually ends
01000	at a space or carriage-return, a sentence at a period, question mark,
01100	or exclamation mark, and a paragraph at a blank line.  A programmer
01200	can specify alternate conventions if so desired.
01300	
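As an illustration only, the default conventions above can be sketched in
Python (which is not PUB's own language).  The function names
split_paragraphs, split_words, and split_sentences are hypothetical, any run
of whitespace is assumed to separate words, and the sketch ignores the
alternate conventions that a programmer may specify.

```python
import re

def split_paragraphs(text):
    """A paragraph ends at a blank line."""
    return [p for p in re.split(r"\n[ \t]*\n", text) if p.strip()]

def split_words(paragraph):
    """A word ends at a space or carriage return.

    Assumption: any run of whitespace counts as one separator.
    """
    return paragraph.split()

def split_sentences(paragraph):
    """A sentence ends at a period, question mark, or exclamation mark."""
    sentences, current = [], []
    for word in split_words(paragraph):
        current.append(word)
        if word[-1] in ".?!":
            sentences.append(" ".join(current))
            current = []
    if current:
        # Trailing words with no terminator still form a sentence here.
        sentences.append(" ".join(current))
    return sentences
```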
01400	During output of the document, the amount of space left between words
01500	and sentences is controlled by various mode settings and is subject
01600	to expansion by a uniform justification algorithm.  Paragraph layout and
01700	indentation are specified by declarations of the command language.
01800	Intra-line formatting operations such as underlining and subscripting
01900	are specified by text control characters designated by the programmer.
02000	
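The idea of expanding inter-word space to justify a line can be sketched in
Python as follows.  This is not PUB's algorithm, whose details are not given
here; the function name justify is hypothetical, and placing leftover spaces
in the leftmost gaps is an assumption made for the illustration.

```python
def justify(words, width):
    """Pad the gaps between words so the line is exactly `width` columns."""
    if len(words) < 2:
        return " ".join(words)  # no gap to expand
    gaps = len(words) - 1
    text_len = sum(len(w) for w in words)
    total_spaces = width - text_len
    if total_spaces < gaps:
        raise ValueError("words do not fit in the requested width")
    # Share the spaces as evenly as possible among the gaps.
    base, extra = divmod(total_spaces, gaps)
    parts = []
    for i, word in enumerate(words[:-1]):
        parts.append(word)
        # Assumption: leftover spaces go to the leftmost gaps.
        parts.append(" " * (base + (1 if i < extra else 0)))
    parts.append(words[-1])
    return "".join(parts)
```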
02100	Each line of the manuscript that begins with a specified character
02200	in column 1 is a "command line".  The Period is the character that
02300	normally serves this function but, like all control characters,
02400	it may be changed by declarations of the command language.
02500	A command line generally contains command language information, but
02600	it is possible to switch to text language by use of the delimiter
02700	"}" (right curly bracket).  In text language, it is possible to
02800	switch back to command language by use of a designated control
02900	character.  The recommended character to serve this function is
03000	"{" (left curly bracket).
03100	
03200	Each line of the manuscript that does not have the designated character
03300	(normally the Period) in column 1 is a "text line".  A text line
03400	generally contains text language information, but it is possible to
03500	switch to command language using the "{" control character, and to
03600	switch back to text with the "}" delimiter.
03700	
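The switching rules described above can be sketched in Python.  This is an
illustration under stated assumptions, not PUB's scanner: the function name
segments is hypothetical, the default characters ".", "{", and "}" are taken
as fixed, and brackets that occur in the "wrong" mode are simply kept as
ordinary characters.

```python
def segments(line, command_char="."):
    """Split one manuscript line into (mode, content) pairs."""
    # The character in column 1 decides the initial mode of the line.
    if line.startswith(command_char):
        mode, rest = "command", line[len(command_char):]
    else:
        mode, rest = "text", line
    result, current = [], ""
    for ch in rest:
        if mode == "text" and ch == "{":
            switch_to = "command"   # "{" leaves text language
        elif mode == "command" and ch == "}":
            switch_to = "text"      # "}" leaves command language
        else:
            current += ch
            continue
        if current:
            result.append((mode, current))
        mode, current = switch_to, ""
    if current:
        result.append((mode, current))
    return result
```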
03800	An important statement of the command language is the ∪computed ∪text
03900	statement.  Syntactically, it is any variable, constant, or
04000	parenthesized expression that occurs in isolation; most frequently,
04100	it occurs between curly brackets as a brief command embedded in
04200	a text line.  The variable, constant, or parenthesized expression is
04300	evaluated, and its character string value is inserted into the
04400	document output.  An example of the use of computed text is shown
04500	below:
04600	.B
04700		.VERSION ← 6 ;
04800		Fidjel Report, version no. {VERSION}, created {DATE}.
04900	.E
05000	The statement "VERSION ← 6" assigns the value "6" to the variable
05100	VERSION.  The next line of text includes two computed text statements.
05200	The first outputs the value of the variable VERSION; the second
05300	outputs the value of the variable DATE, which is automatically computed
05400	by PUB.  If the above manuscript were compiled on March 8, 1973, the
05500	output produced would be:
05600	.B
05700		Fidjel Report, version no. 6, created March 8, 1973.
05800	.E
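The substitution performed in this example can be imitated by a short Python
sketch.  It is an illustration, not PUB's implementation: the names
expand_computed_text and format_date are hypothetical, only variable names are
handled (not constants or parenthesized expressions), and the date is supplied
explicitly rather than computed automatically.

```python
import re
from datetime import date

MONTHS = ("January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December")

def format_date(d):
    """Render a date in the style of the example, e.g. 'March 8, 1973'."""
    return f"{MONTHS[d.month - 1]} {d.day}, {d.year}"

def expand_computed_text(line, variables):
    """Replace each {NAME} with the character string value of NAME."""
    def substitute(match):
        name = match.group(1).strip()
        return str(variables[name])  # raises KeyError for unknown names
    return re.sub(r"\{([^{}]*)\}", substitute, line)

variables = {"VERSION": 6, "DATE": format_date(date(1973, 3, 8))}
line = "Fidjel Report, version no. {VERSION}, created {DATE}."
print(expand_computed_text(line, variables))
```

The integer 6 is converted to the string "6" before insertion, mirroring the
statement that the character string value of the expression is what reaches
the document.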
     

00100	.SEC MACROS
00200	A sequence of PUB commands that is repeated throughout the manuscript can
00300	be abbreviated by use of the macro facility.  For example, a typical
00400	sequence that occurs at the beginning of each section or chapter is:
00500	.b
00600	α.NEXT PAGE ; NEXT SECTION ;
00700	.e
00800	These commands force output to a new page, increment the page number,
00900	and increment the section number.  They can be incorporated into a macro
01000	declaration as follows:
01100	.b
01200	α.MACRO SEC ⊂ NEXT PAGE ; NEXT SECTION ; ⊃
01300	.e
01400	Once this macro has been declared, it can be invoked by name in any
01500	command line:
01600	.b
01700	α.SEC
01800	.e
01900	PUB expands the macro and performs the indicated operations.
02000	
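Macro expansion of this kind can be sketched in Python.  The sketch is an
illustration and not PUB's mechanism: the function name expand_macros is
hypothetical, statements are represented as plain strings, and the assumption
that a macro body may invoke other macros is not taken from the text above.

```python
def expand_macros(statements, macros):
    """Replace each statement that names a macro with the macro's body."""
    expanded = []
    for statement in statements:
        if statement in macros:
            # Assumption: a macro body may itself invoke other macros.
            # A macro that invokes itself would recurse without end.
            expanded.extend(expand_macros(macros[statement], macros))
        else:
            expanded.append(statement)
    return expanded

# Declaration corresponding to: MACRO SEC [ NEXT PAGE ; NEXT SECTION ; ]
macros = {"SEC": ["NEXT PAGE", "NEXT SECTION"]}
print(expand_macros(["SEC"], macros))
```

Statements that do not name a macro pass through unchanged, so the same
routine can be applied to every command line.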
     

00100	.SEC LABELS AND CROSS-REFERENCES
     

00100	.SEC FRONT AND BACK MATTER
     

00100	.SEC COUNTERS
     

00100	.SEC SECTIONING
     

00100	.SEC PAGE LAYOUT
     

00100	.SEC IMPLEMENTATION
     

00100	.SEC DISADVANTAGES
     

00100	.SEC PLANNED IMPROVEMENTS